Abstract

Curve  Resolution  and  the  Generalized  Inverse  Method  are  used 
to  calculate  the  molar  compositions  of  mixture  mass  spectra  acquired 
during  coelution  of  unseparated  GC  peaks.  Quantitative  resolution 
of  the  GC  signal,  independent  of  peak  shape,  results.  The  effects 
of  noise,  peak  separation,  a  drifting  base  line  and  peak  tailing 
are  studied.  No  assumptions  of  peak  shape  or  the  presence 
of  unique  masses  for  the  pure  components  are  required. 
INTRODUCTION

Combined gas chromatography-mass spectrometry (GCMS) has
become an indispensable tool of analysis in many fields. Its
popularity stems from the potential to analyze very complex samples
in a matter of minutes and from the availability of simple-to-operate
computer-controlled instruments (1). Normally GCMS data
contain the information needed to positively identify and quantify
major and minor constituents in samples (2,3). However, serious problems
arise when the eluting components are only partially separated, and
accurate qualitative and quantitative analyses become quite difficult.
Fortunately, qualitative analysis is possible when the parent spectra
of the chromatographically unresolved components are mathematically
resolved (4).

Quantitative analysis of single component peaks by GC is a well
documented and widely used technique offering general, selective and
specific detection capabilities. It provides the analyst with an excellent
linear dynamic range and reasonable precision and accuracy (3). With
complex samples, however, various conditions may not allow complete
resolution of all peaks, which must be resolved before their areas
or heights are measured for quantitative analysis.

Linear (geometric) and nonlinear (curve fitting) methods have
been developed to resolve fused GC peaks (3, 6). Linear methods suffer
from being inaccurate and imprecise (6). Nonlinear methods are more
accurate but they are difficult to employ as they require several
parameters to be estimated by nonlinear parameter estimation techniques.
Poorly estimated initial parameters often lead to either divergence
or convergence to incorrect values (6). The accuracy of linear and
nonlinear methods is determined by the shapes of the GC peaks, the
mathematical peak shape function used and detector noise. The peak shape
models adopted for a particular case may not apply if conditions,
concentrations or columns are changed (7-11). Consequently, an algorithm
or peak shape model chosen for one situation may have to be revised
for another. Two techniques relying on curve fitting for resolving
GCMS data have been described (12,13).

Smith et al. and Blaisdell described two similar systems for
quantitative and qualitative analysis of GCMS data (14,15). Both systems
require libraries of previous analyses, a knowledge of masses unique to
specific components and long computer time. These characteristics limit
these methods to special purpose analyses. Knorr et al. demonstrated
an algorithm that models the time-varying GC peak and extracts spectral
information by a least squares procedure without a dependence on unique
masses (16). Successful application is dependent on the shape of the
response surface whose minimum is being sought. Lundeen and Juvet
related the GC response to a polynomial in concentration (17). The
success of the method, however, requires that the analyst choose the
correct polynomial, know the number of overlapping components in
advance and guess a good starting point to ensure quick convergence,
if the procedure converges at all.


To overcome these problems the mass spectrometer has been used
as a specific detector for the GC (18). The mass spectrometer is utilized
in a selective ion monitor mode (SIM) or a limited ions monitor mode (LIM).
Both methods offer high sensitivity and selectivity. The techniques
have been used successfully in quantifying chromatographically
unresolved compounds (19-21). Nevertheless, complete structural
information about the components of interest and other constituents in
the sample is lost. SIM and LIM are very explicit techniques; the analyst
must obtain the spectra of the analytes prior to analysis and the
absence of interferences must be assured. Additionally, the selected
ions may not always be the most abundant ones and their response may not
offer a useful linear dynamic range. While SIM and LIM have been
successfully implemented in several situations, a fruitful analysis scheme
must be "staged" in advance. Quantitation by SIM and LIM suffers from
poor precision and larger variations (compared to GC) due to ion
statistics (22-24). The variance of SIM quantitation can be twice that
obtained with a flame ionization detector (25). The attractiveness of SIM
and LIM is attributed to speed of analysis and high selectivity. When
unique masses exist and they are known, there is no need to resolve GC
peaks before quantitation.

A system for GCMS data reduction relying on SIM and LIM detection has
been described (26).

Since GC offers the linearity, accuracy and precision needed for
complex sample analysis, it is desirable to extend its capability to
quantify overlapping peaks without relying on limiting assumptions (i.e.
specific ions). This paper introduces one method and illustrates and
compares two methods, each depending on multivariate curve resolution (4, 27),
for resolving GC peaks. Both methods are fast, accurate and independent
of the shape of the GC peaks. Most importantly, they do not require
peak shape modeling or the estimation of any peak shape parameters. For
simplicity, the methods will be illustrated as applied to
chromatographically unresolved binary mixtures. A generalization to more
complex multicomponent systems is in progress.

The two methods offer the analyst a direct solution to the peak
deconvolution problem and ensure that the complete instrumental
resolution is realized. Moreover, quantitation accuracies can actually
be better than instrument response precision due to a degree of
signal averaging inherent to multivariate statistical analysis.


EXPERIMENTAL 


Instruments 

Experimental data were obtained using a Hewlett-Packard 5985 GCMS.
The system consists of an HP 5840 microprocessor-controlled GC, a
jet separator and a hyperbolic quadrupole mass spectrometer.

Material 

Analytical grade reagents with a boiling range of 0.5-1.0 °C were
used without further purification.

Procedure 

The mixtures were injected into a 6 ft packed glass column. The
packing material was 3% SE-30 on 100/120 mesh Gas Chrom Q
support. The resolution of the effluents was varied by adjusting the
flow rate and the temperature of the column. The mass
spectra were scanned at a rate of 1 scan per second over the range
M/z 15-150.

Computer  Programs 

A Fortran program that determines the number of components under a
GC peak and then performs qualitative and quantitative resolution is
available from Infometrix, Inc., P.O. Box 25808, Seattle, WA 98125.
The analysis of a data matrix containing 300 entries (e.g. 10x30)
requires 40 Kbytes on a Z-80 microprocessor-based microcomputer when
compiled with Microsoft Fortran.
THEORY

The Generalized Inverse Method. When two components coelute from the
GC, the collected mass spectra are assumed to be linear combinations of the
components' pure mass spectra. At every point in time during the
emergence of the combined peaks, the eluting mixture is composed of $\alpha(t)$
mole fraction of the first component and $\beta(t)$ mole fraction of the second
component. The intensities of the GC responses are also considered to
be non-negative linear combinations of the responses of the pure
components. The combinations in both cases are the same. If N mass
spectra are collected by scanning I M/z signals, the experimental mass
spectra constitute a data matrix, X, and are represented by the
following matrix equation:

$$X_{N,I} = C_{N,2}\,P_{2,I} \qquad (1)$$

where $c_{kj}$ is the mole fraction of the $j$th component in the $k$th
scanned spectrum (the eluting mixture), and $p_{ji}$ is the intensity of
the $i$th signal in the $j$th pure component. If the pure spectra are either
known or can be estimated, an assumption that can be verified by
statistics, the solution band widths (4), or library search, then

$$C = X P^{T} (P P^{T})^{-1} \qquad (2)$$

where $P^{T}$ is the transpose of the matrix P and $P^{T}(P P^{T})^{-1}$ is the
so-called generalized inverse of P. This solution for C is nothing new. It is
simply the multivariate least squares solution. It is used here as
a standard benchmark for the method described in the next section.
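As a minimal sketch of eq. 2, assuming NumPy and a small set of made-up pure spectra (the arrays `P` and `C_true` below are hypothetical and are not data from this work), the least squares estimate of C could be computed as follows:

```python
import numpy as np

def generalized_inverse_fractions(X, P):
    """Estimate C from X = C P via C = X P^T (P P^T)^-1 (eq. 2)."""
    # X: (N, I) mixture spectra, one scan per row.
    # P: (2, I) pure-component spectra, one component per row.
    return X @ P.T @ np.linalg.inv(P @ P.T)

# Hypothetical pure spectra over five M/z channels (illustrative values only).
P = np.array([[0.50, 0.30, 0.20, 0.00, 0.00],
              [0.00, 0.10, 0.20, 0.30, 0.40]])
# Hypothetical mole fractions for three scans.
C_true = np.array([[0.9, 0.1],
                   [0.5, 0.5],
                   [0.2, 0.8]])
X = C_true @ P                      # noise-free mixture spectra
C_est = generalized_inverse_fractions(X, P)
```

When P has full row rank, `np.linalg.pinv(P)` yields the same generalized inverse and avoids forming the explicit inverse of $P P^{T}$.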
The time-varying response (e.g. TIC or FID) at every point in time,
i, when a mass spectrum is scanned is resolved by partitioning the
total signal according to $c_{i1}$ and $c_{i2}$. If the ratio of $c_{i1}$ to $c_{i2}$
is denoted by $R_i$, then the contributions of the first component, $S_{1,i}$, and
the second component, $S_{2,i}$, to the total GC signal, $S_{tot,i}$, are

$$S_{1,i} = [R_i/(1 + R_i)]\,S_{tot,i} \qquad (3)$$

$$S_{2,i} = [1/(1 + R_i)]\,S_{tot,i} \qquad (4)$$

If $R_i$ can be estimated for each mixture spectrum, then quantitative
resolution in the time domain is straightforward.
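The partition in eqs. 3 and 4 can be illustrated with a short sketch. The function name and the numerical values are hypothetical, and the second mole fraction is assumed to be nonzero so that the ratio is defined:

```python
import numpy as np

def partition_total_signal(c1, c2, s_tot):
    """Split the total GC signal using R_i = c_i1 / c_i2 (c2 must be nonzero)."""
    c1, c2, s_tot = (np.asarray(v, dtype=float) for v in (c1, c2, s_tot))
    R = c1 / c2
    s1 = R / (1.0 + R) * s_tot      # eq. 3: first component
    s2 = 1.0 / (1.0 + R) * s_tot    # eq. 4: second component
    return s1, s2

# Hypothetical mole fractions and total signal for three scans.
s_tot = np.array([10.0, 40.0, 20.0])
s1, s2 = partition_total_signal([0.9, 0.5, 0.2], [0.1, 0.5, 0.8], s_tot)
```

The two contributions always add up to the total signal at each scan, which is a convenient sanity check on any implementation.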


The Curve Resolution Method. Factor analysis of the data matrix, X,
yields two principal eigenvectors, $V_1$ and $V_2$, containing all the
variance associated with the two chemical components (4). Each experimentally
scanned mass spectrum, $MS_i$, is represented by a point, $(a_i, b_i)$, in a
two-dimensional (factor loadings) space. Coordinates $(a_i, b_i)$ for
each point are given by

$$a_i = MS_i \cdot V_1 \qquad (5)$$

$$b_i = MS_i \cdot V_2 \qquad (6)$$

After normalization to unit area, the points representing all N
measured spectra lie on a straight line in factor loadings space.
This is shown in Figure 1 with N = 4.
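The projection in eqs. 5 and 6 can be sketched as follows. This is an illustration under stated assumptions rather than the program used here: the spectra are normalized to unit area before the decomposition, the eigenvectors are taken as the first two right singular vectors of the uncentered data matrix, and the arrays `P` and `C` are hypothetical.

```python
import numpy as np

def factor_coordinates(X):
    """Return loadings (a_i, b_i) of each unit-area spectrum on V1 and V2."""
    Xn = X / X.sum(axis=1, keepdims=True)            # normalize each scan to unit area
    _, _, Vt = np.linalg.svd(Xn, full_matrices=False)
    V1, V2 = Vt[0], Vt[1]                            # two principal eigenvectors
    return Xn @ V1, Xn @ V2                          # eqs. 5 and 6

# Hypothetical pure spectra and mole fractions for N = 4 scans.
P = np.array([[5.0, 3.0, 2.0, 0.0, 0.0],
              [0.0, 1.0, 2.0, 3.0, 4.0]])
C = np.array([[0.9, 0.1], [0.6, 0.4], [0.3, 0.7], [0.1, 0.9]])
a, b = factor_coordinates(C @ P)

# Collinearity check: the 2-D cross product with the first-to-last vector vanishes.
cross = (a - a[0]) * (b[-1] - b[0]) - (b - b[0]) * (a[-1] - a[0])
```

The sign of each singular vector is arbitrary, so the plotted line may appear mirrored from run to run, but the collinearity of the points is unaffected.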

The shaded areas in Figure 1 represent the two solution
bands that contain the parent spectra. In the eigenvector domain,
each solution band is defined by an "outer" and an "inner" spectrum.
Thus, points $(a_{u1}, b_{u1})$ and $(a_{u2}, b_{u2})$ represent the outer spectra of
the first and second solution bands respectively. Similarly, $(a_1, b_1)$
and $(a_4, b_4)$ represent the inner spectra of the solution bands. In
the spectral domain, however, the outer and inner spectra will, for
convenience, be referred to as the upper and lower edges of the
solution bands respectively. The two points designated by $(a_{u1}, b_{u1})$
and $(a_{u2}, b_{u2})$ in Figure 1 correspond to the spectra of the upper
edges of the solution bands (4). The points corresponding to the spectra
of the lower edges of the solution bands in Figure 1, $(a_1, b_1)$ and
$(a_4, b_4)$, represent the "purest" mass spectra scanned during elution.
Note that these points do not always correspond to the first and last
scanned mass spectra. Other points in between represent non-negative
linear combinations of these spectra. Let m and n symbolize the purest
two spectra and let i be any middle point representing a mixture of
$\alpha'$ mole fraction of m and $\beta'$ mole fraction of n. The following
relationships hold:

$$MS_i = \alpha' MS_m + \beta' MS_n \qquad (7)$$

$$MS_m = a_m V_1 + b_m V_2 \qquad (8)$$

$$MS_n = a_n V_1 + b_n V_2 \qquad (9)$$

$$MS_i = a_i V_1 + b_i V_2 \qquad (10)$$

Substituting eqs. 8, 9 and 10 for $MS_m$, $MS_n$ and $MS_i$, respectively, in
eq. 7 gives

$$a_i V_1 + b_i V_2 = (\alpha' a_m + \beta' a_n) V_1 + (\alpha' b_m + \beta' b_n) V_2 \qquad (11)$$
The above vector equation implies that

$$a_i = \alpha' a_m + \beta' a_n \qquad (12)$$

and

$$b_i = \alpha' b_m + \beta' b_n \qquad (13)$$

The Euclidean distance between point m and point i, $d_{mi}$, in
the factor loadings space is given by

$$d_{mi}^2 = (a_m - a_i)^2 + (b_m - b_i)^2 \qquad (14)$$

Substituting for $a_i$ and $b_i$ from eqs. 12 and 13, and using
$\alpha' + \beta' = 1$, eq. 14 can be rewritten as

$$d_{mi}^2 = \beta'^2 [(a_m - a_n)^2 + (b_m - b_n)^2] \qquad (15)$$

Similarly,

$$d_{ni}^2 = \alpha'^2 [(a_m - a_n)^2 + (b_m - b_n)^2] \qquad (16)$$

Therefore,

$$d_{mi}/d_{ni} = \beta'/\alpha' \qquad (17)$$

This proves that the ratio of the distance between points m and i
to the distance between points n and i is the same as the ratio of
the mole fraction of n in i to the mole fraction of m in i. If m
and n represent the spectra of the pure components, an assumption that
can be verified as mentioned above, then $\alpha'$ and $\beta'$ are also the true
mole fractions $\alpha$ and $\beta$.

If m and n do not represent the spectra of the pure components, then
$\alpha'$ and $\beta'$ are only estimates of the true mole fractions $\alpha$ and $\beta$. The
adequacy of these estimates is determined by the chromatographic
resolution as well as the shapes of the peaks. The better the resolution,
the closer $\alpha'$ and $\beta'$ are to $\alpha$ and $\beta$ respectively.

If we let

$$R_i = d_{mi}/d_{ni} \qquad (18)$$

then the total chromatographic signal at point i, $S_{tot,i}$, is resolved
into the contribution of the first component, $S_{m,i}$, and that of the
second component, $S_{n,i}$, according to

$$S_{m,i} = [1/(1 + R_i)]\,S_{tot,i} \qquad (19)$$

and

$$S_{n,i} = [R_i/(1 + R_i)]\,S_{tot,i} \qquad (20)$$
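The distance-ratio relationships in eqs. 17 to 20 can be sketched in a few lines. This is a hedged illustration, not the program used in this work: the function names are hypothetical, the purest pair is assumed to be the two points farthest apart in the loadings plane, and the fractions are written as $d_{ni}/(d_{mi} + d_{ni})$ and $d_{mi}/(d_{mi} + d_{ni})$, which equal $1/(1 + R_i)$ and $R_i/(1 + R_i)$ while avoiding a division by zero at point n.

```python
import numpy as np

def purest_pair(a, b):
    """Indices of the two points farthest apart in the loadings plane."""
    D = np.hypot(a[:, None] - a[None, :], b[:, None] - b[None, :])
    m, n = np.unravel_index(np.argmax(D), D.shape)
    return int(m), int(n)

def resolve_by_distance(a, b, s_tot):
    """Partition the total signal between components m and n (eqs. 19 and 20)."""
    m, n = purest_pair(a, b)
    d_m = np.hypot(a - a[m], b - b[m])   # distance of every point from m
    d_n = np.hypot(a - a[n], b - b[n])   # distance of every point from n
    alpha = d_n / (d_m + d_n)            # equals 1 / (1 + R_i), R_i = d_mi / d_ni
    beta = d_m / (d_m + d_n)             # equals R_i / (1 + R_i)
    return alpha * s_tot, beta * s_tot

# Hypothetical collinear loadings for four scans and a flat total signal.
a = np.array([0.0, 1.0, 3.0, 4.0])
b = np.array([0.0, 0.5, 1.5, 2.0])
s_tot = np.array([8.0, 8.0, 8.0, 8.0])
s_m, s_n = resolve_by_distance(a, b, s_tot)
```

With noisy data the points are only approximately collinear, so the two fractions computed this way remain normalized to one but are estimates rather than exact projections onto the line.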
RESULTS AND DISCUSSION

Simulated Data. The coelution of α-pinene and pinane, two structurally
similar compounds, is represented by a gamma and a Gaussian peak shape
profile respectively. The choice was made to examine the effects of
tailing. The symmetric Gaussian profile is superimposed on the "tail"
of the unsymmetric gamma function. Various chromatographic resolution
situations are simulated by moving the Gaussian profile along the time
axis. The separation of the two peaks can be expressed as the distance
between the two maxima in units of σ, the standard deviation of the
Gaussian peak, which is also equal to the variance of the gamma peak
profile (σ = 0.5). Mass spectra sampled as a function of time are assumed
to be linear combinations of the literature mass spectra of the pure
components (28). The linear combinations are determined by the relative
heights of the theoretical profiles.
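A simulation of this kind can be sketched as follows. The gamma shape and scale, the time axis, the number of scans and the spectra in `P` are all hypothetical choices made for illustration (the gamma parameters were picked only so that the variance is 0.5); they are not the values or the literature spectra used in this work.

```python
import numpy as np

def gamma_profile(t, shape, scale):
    """Tailing peak: unnormalized gamma density scaled to unit height."""
    t = np.clip(t, 0.0, None)
    y = t ** (shape - 1.0) * np.exp(-t / scale)
    return y / y.max()

def gaussian_profile(t, center, sigma):
    """Symmetric peak with unit height at its center."""
    return np.exp(-0.5 * ((t - center) / sigma) ** 2)

def simulate_scans(P, separation, sigma=0.5, n_scans=30):
    """Build mixture spectra for a gamma peak followed by a Gaussian peak."""
    t = np.linspace(0.0, 6.0, n_scans)
    shape, scale = 2.0, 0.5                 # hypothetical; variance = shape * scale**2 = 0.5
    mode = (shape - 1.0) * scale            # position of the gamma maximum
    h1 = gamma_profile(t, shape, scale)
    h2 = gaussian_profile(t, mode + separation * sigma, sigma)
    H = np.column_stack([h1, h2])           # profile heights per scan
    return t, H, H @ P                      # X[k] = h1[k] * P[0] + h2[k] * P[1]

# Hypothetical pure spectra over four M/z channels.
P = np.array([[0.5, 0.3, 0.2, 0.0],
              [0.1, 0.2, 0.3, 0.4]])
t, H, X = simulate_scans(P, separation=4.0)
```

Setting `separation=0.0` places the two maxima at the same time, which corresponds to the zero chromatographic resolution case.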

Figures 2 and 3 show the results of resolving the convoluted
chromatographic peaks at separations of 0 and 4σ respectively. The
calculated  curves  are  identical  to  the  theoretical  ones.  Results  of 
the  Curve  Resolution  Method  (CR)  and  the  Generalized  Inverse  Method 
(GI)  with  the  first  and  last  scanned  spectra  used  as  standards  are  the 
same.  The  situation  shown  in  Figure  2  represents  a  chromatographic 
resolution  of  zero.  The  procedures,  however,  rely  on  spectral  resolution 
and  chromatographic  resolution.  In  this  case,  spectral  resolution 
is  achieved  because  the  peaks  have  different  shapes. 

When noise is added to the simulated mixture mass spectra (a
more realistic situation), the applications of CR and GI yield different
results. Random noise of 1%, 3%, 5%, 7% and 10% was added. The GC peaks
are then resolved by CR and GI and the areas under the calculated curves
are compared to the true areas. When noise is present, the first and
last acquired spectra are badly affected because of their low total
intensity. An advantage of CR over GI is then apparent, as CR
is able to indicate the least contaminated spectra. If GI is to be
employed, the analyst has to examine several spectra, by library searches
or inspection, in order to determine the purest acquired
ones. In contrast, CR eliminates the need for such validation steps.
In the following analysis, the least contaminated
spectra indicated by CR are used as standards in both CR and GI. Tables I and II
show the average errors of 30 random perturbations obtained by both methods
at separations of 0 and 4σ respectively. CR is more "accommodating"
to noise than GI. Errors obtained by application of GI can be twice
as high as those obtained by CR. This is due to the fact that CR calculates
the GC curves in an eigenvector space. Factor analysis is
a signal averaging procedure. It is a powerful statistical means of
minimizing the effects of random variations (29). In contrast, the curves
calculated by GI are subject to an error propagation that is dependent
on the condition number of the data matrix (30). As chromatographic
separation increases, the effects of random noise become minimal.

When the noise level is very high (>10%) CR is disadvantaged
because the noise becomes a "factor". The averaging of systematic
changes is inadequate in accounting for the total variance and
significant errors are introduced. At optimal instrumental operating
conditions, where random variations are less than 10%, CR becomes the method
of choice.
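The perturbation experiment can be sketched for the GI side only. Several points are assumptions made for this illustration: the noise is Gaussian with a standard deviation equal to the stated percentage of the largest intensity in the data matrix, the pure spectra are taken as known exactly, and the profiles `H` and spectra `P` are hypothetical. The sketch does not reproduce Tables I and II.

```python
import numpy as np

def gi_area_error(H, P, level, n_trials=30, seed=0):
    """Mean relative error in the first component's area over n_trials perturbations."""
    rng = np.random.default_rng(seed)
    X = H @ P                               # noise-free mixture spectra
    scale = level * X.max()                 # assumption: noise relative to the largest signal
    P_pinv = np.linalg.pinv(P)              # generalized inverse of P
    true_area = H[:, 0].sum()
    errors = []
    for _ in range(n_trials):
        noise = rng.standard_normal(X.shape)
        H_est = (X + scale * noise) @ P_pinv
        errors.append(abs(H_est[:, 0].sum() - true_area) / true_area)
    return float(np.mean(errors))

# Hypothetical elution profiles (two Gaussians) and pure spectra.
t = np.linspace(0.0, 6.0, 30)
H = np.column_stack([np.exp(-0.5 * ((t - 2.0) / 0.5) ** 2),
                     np.exp(-0.5 * ((t - 3.0) / 0.5) ** 2)])
P = np.array([[0.5, 0.3, 0.2, 0.0],
              [0.1, 0.2, 0.3, 0.4]])
levels = (0.0, 0.01, 0.03, 0.05, 0.07, 0.10)
errors = {level: gi_area_error(H, P, level) for level in levels}
```

Because the same seed is reused for every noise level and the GI estimate is linear in the data, the error in this sketch grows in direct proportion to the noise level.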

The composition of the eluting mixture was varied by changing the
relative areas under each peak. Two typical cases are investigated
where the two maxima are separated by 4σ. In the first case, the
Gaussian peak is preceded by a smaller gamma peak. The noise-free
data are, again, perturbed by a random noise of 1% to simulate a
real-life laboratory situation. Table III summarizes the errors in
calculating the areas under both curves by CR. The reported deviations
are the averages of 30 different perturbations. The Gaussian profile
does not cause problems as it dominates the total chromatogram. The
combined signal is accurately resolved. The second case is the opposite
of the first one. The gamma distribution is followed by a smaller
Gaussian shaped peak. Table IV lists the average errors in calculating
the areas under the GC peaks by CR. The errors shown are the averages
of 30 data sets randomly perturbed by 1% noise. As the tailing peak
predominates, the Gaussian profile becomes completely "buried" underneath
it. No uncontaminated mass spectral scans are obtained for the component
represented by the Gaussian profile because of the low S/N of the peak. This
"tailing" prevents accurate reproduction of the GC signal at lower
relative compositions. At comparable ratios the GC signals are resolved
successfully.

A drifting base line during the elution of a single GC peak can
prevent accurate quantitative analysis. The effects of a drifting base line
are not restricted to detector noise. A bleeding column, for example,
introduces extra mass spectral signals that preclude correct identification
of the eluting component and prevent accurate quantitation in the
chromatographic domain. The information pertaining to a shift in the
base line and/or a column bleed is contained in the intensity and
spectral data. By factor analyzing these data, CR can successfully
separate these effects from the information associated with the eluting
peak. Drifting base lines can be treated as interfering components using
the curve resolution approach and can be resolved from the GC signal.
Two kinds of drifting base lines are simulated here. The first
is a linear base line of the form $f(t) = k_1 t$ and the second is a
nonlinear base line of the form $f(t) = k_1 t + k_2 t^2$. The value of t varied
sequentially from 1 to N, where N is the total number of mass spectra
acquired. In both cases the base line drift accounted for 3% of the
total peak area. The simulated data are perturbed by 1% and 3% random
noise. Table V shows the results of resolving the single peak and the
base line by CR. The deviations in Table V are the averages of 30 random
perturbations. The peak and the base line in both cases are successfully
resolved.
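Treating the drift as an interfering component can be illustrated with a small simulation. The peak shape, the two spectra, the coefficients of the drift and the reading of "3% of the total peak area" as the ratio of base line area to peak area are all assumptions made for this sketch.

```python
import numpy as np

def drifting_baseline(n_scans, k1, k2=0.0, fraction=0.03, peak_area=1.0):
    """Base line k1*t + k2*t**2 for t = 1..N, rescaled to a fraction of the peak area."""
    t = np.arange(1, n_scans + 1, dtype=float)
    f = k1 * t + k2 * t ** 2
    return f * (fraction * peak_area / f.sum())

n_scans = 30
t = np.arange(1, n_scans + 1, dtype=float)
peak = np.exp(-0.5 * ((t - 15.0) / 3.0) ** 2)     # hypothetical single GC peak
p_analyte = np.array([0.6, 0.3, 0.1, 0.0])        # hypothetical analyte spectrum
p_bleed = np.array([0.0, 0.1, 0.2, 0.7])          # hypothetical column-bleed spectrum

linear = drifting_baseline(n_scans, k1=1.0, peak_area=peak.sum())
quadratic = drifting_baseline(n_scans, k1=1.0, k2=0.05, peak_area=peak.sum())

# The drift enters the data matrix like a second chemical component.
X = np.outer(peak, p_analyte) + np.outer(linear, p_bleed)
singular_values = np.linalg.svd(X, compute_uv=False)
```

Only two singular values are significant, which is why a single peak on a drifting base line can be handled by the same two-component machinery as a binary mixture.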

Experimental Data. Figure 4 shows the coelution of n-heptane and
methylcyclohexane at three different degrees of resolution. The
chromatographic response, Total Ion Current (TIC), was resolved into
two peaks, one for n-heptane and the other for methylcyclohexane, by
CR and GI. Methylcyclohexane and n-heptane have specific signals at M/z=98
and M/z=100 respectively. These specific signals were regarded as
"internal standards" because their responses are completely resolved
regardless of the chromatographic resolution. In order to make the
problem much more difficult for CR, these specific ion signals were
excluded from the data matrix when the GC peaks were resolved. The
areas of the resolved GC peaks were then compared to the areas of the
specific mass chromatograms (MCs) in all three cases shown in Figure 4.

Table VI lists the areas of the resolved GC peaks and those of the MCs
of the specific, but not the most abundant, mass spectral signals for
n-heptane and methylcyclohexane. The analytical reliability of
the areas of the resolved GC peaks is evaluated by correlating them to
the areas of the specific MCs. The Fisher transformation is used to
establish the 99% confidence intervals of the correlation coefficients (31).
Tables VII and VIII list the regression parameters and the correlation coefficients
and their 99% confidence intervals for the areas of the specific MCs
and the resolved GC peaks for n-heptane and methylcyclohexane respectively.
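For readers unfamiliar with the Fisher transformation, a minimal sketch of the confidence interval calculation is given below. It uses the standard large-sample form with standard error $1/\sqrt{n-3}$; the correlation coefficient and sample size in the example are hypothetical and are not taken from Tables VII and VIII.

```python
import math
from statistics import NormalDist

def fisher_confidence_interval(r, n, level=0.99):
    """Confidence interval for a correlation coefficient via z = atanh(r)."""
    if n <= 3 or not -1.0 < r < 1.0:
        raise ValueError("need n > 3 and -1 < r < 1")
    z = math.atanh(r)                                   # Fisher transformation
    z_crit = NormalDist().inv_cdf(0.5 + level / 2.0)    # two-sided critical value
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)

# Hypothetical example: r = 0.95 observed on 12 paired areas.
low, high = fisher_confidence_interval(r=0.95, n=12)
```

The interval is symmetric in the transformed variable z but asymmetric in r, which keeps both limits inside the range from -1 to 1.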

The high correlations between the areas of the resolved GC peaks
and the areas of the always resolved MCs are evident. There is no loss
of information upon application of CR and GI to the convoluted GC signal.
The regression parameters for each component remain virtually unchanged
even though the chromatographic resolution is worse by a factor of 5
(case c vs. case a). This and the high correlations indicate that in all
three cases the GC peaks are accurately resolved and that the resolved
GC signals are as analytically reliable as the specific MCs. The
potential of multivariate curve resolution is thus evident as it allows
the GC response to furnish information equivalent to that of a specific
detector, but with better precision. Note that the specific signals
were excluded from the data matrix before analysis. This is a clear
advantage of the curve resolution method over SIM, LIM and other methods
that rely on specific MCs.

Additionally, CR requires the analyst to collect multivariate data.
This improves the analytical signal, reduces the effects of
random background variations and detector noise and increases
the amount of information acquired during the experiment. Moreover,
CR (as shown above) does not require specific signals to be present.
A very important feature of CR is that it indicates the purest scanned
mass spectra. Since these spectra are not always the first and last
scanned spectra, CR presents a valuable tool for choosing the mass
spectra most suitable for library searches.

In the present example, both CR and GI have performed equally well.
The data do not show any significant differences in the results. A
t-test shows that the differences in the areas of the n-heptane peaks
as calculated by CR and GI are insignificant at the 0.05 significance
level. The same is true for the methylcyclohexane peaks. However,
CR is generally preferred because of its ability to accommodate
experimental noise. In GI the accuracy is determined to a large extent
by the condition number of the data matrix; the more similar the
parent spectra, the higher the condition number.
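The link between spectral similarity and conditioning can be illustrated numerically. In this sketch the condition number is computed for a matrix of two reference spectra, which is an assumption about which matrix is meant, and both pairs of spectra are hypothetical.

```python
import numpy as np

# Two nearly identical hypothetical spectra (rows).
similar = np.array([[0.50, 0.30, 0.20],
                    [0.45, 0.33, 0.22]])
# Two hypothetical spectra with no signal in common.
dissimilar = np.array([[1.0, 0.0, 0.0],
                       [0.0, 0.0, 1.0]])

# 2-norm condition number: ratio of largest to smallest singular value.
cond_similar = np.linalg.cond(similar)
cond_dissimilar = np.linalg.cond(dissimilar)
```

The nearly identical pair gives a much larger condition number than the pair with no common signals, so a given level of noise is amplified more strongly when the least squares solution is formed.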

The accuracy of curve resolution depends upon at least three
limitations. First, the inherent instrumental resolving power imposes
a limit that is not possible to exceed via data analysis methods.
A good example is the extreme case of two components eluting from
the column at exactly the same time with the same peak profiles: all
mixture spectra are identical in this case and resolution is impossible.

The next limiting factor is the uniqueness of the spectra of the
eluting compounds. If the spectra contain at least one unique mass each
(it is important to realize that the unique masses are not known a
priori), then the pure spectra will lie on the upper edges of the solution
bands (4). If the spectra of the eluting compounds are quite similar and
contain no unique masses, the upper edges of the solution bands will not
represent the true parent spectra. This deviation will be reflected
in the band widths of the solution bands. In this case finite band widths
will be obtained regardless of the chromatographic resolution (see
case III below).

The angle related to spectral uniqueness is formed by the
intersection of the upper edges of the solution bands at the origin in
Figure 1. When the spectra of the pure components are identical
this angle is zero and no resolution is possible. At the other
extreme, the angle approaches 90° as the spectra become more dissimilar.
If all signals in the parent spectra are specific, this angle becomes
90°.
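The two limiting cases can be checked with a short sketch. It computes the angle between two spectra treated as vectors, which is used here as a stand-in for the angle in Figure 1 under the assumption that the upper edges of the solution bands coincide with the pure spectra; the function name and spectra are hypothetical.

```python
import numpy as np

def spectral_angle_degrees(p1, p2):
    """Angle between two spectra treated as vectors, in degrees."""
    p1, p2 = np.asarray(p1, dtype=float), np.asarray(p2, dtype=float)
    cos = np.dot(p1, p2) / (np.linalg.norm(p1) * np.linalg.norm(p2))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

identical = spectral_angle_degrees([0.5, 0.3, 0.2], [0.5, 0.3, 0.2])
all_specific = spectral_angle_degrees([0.7, 0.3, 0.0, 0.0], [0.0, 0.0, 0.4, 0.6])
partial = spectral_angle_degrees([0.5, 0.3, 0.2, 0.0], [0.0, 0.2, 0.3, 0.5])
```

Because intensities are non-negative, the angle always lies between 0° and 90°, with partially overlapping spectra falling in between.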

The third limitation is the rate of spectral data acquisition.
It is desirable to collect as many mass spectra as possible during the
time the compounds are eluting. The three benefits that result are
improved signal to noise ratio (more spectra are "averaged"), better
integration accuracy (more points are available on the peak profile)
and the acquisition of mixture spectra (the two spectra that
represent the lower edges of the solution bands) that are closest to
pure spectra. In practice the speed of scanning is determined by the
instrumental background and the desired quality of the acquired mass
spectra. The following section describes the best procedures for
qualitative and quantitative analyses for possible situations that the
analyst, or the computer program, may face.

Case I. The widths of the solution bands are equal to zero: This
is the most favorable situation. It is an indication that specific
signals are present and that pure mass spectra have been acquired.
The analyst can use the pure spectra for qualitative analysis and the
factor loadings, i.e. $(a_1, b_1)$ and $(a_4, b_4)$ in Figure 1, to calculate
the molar compositions. No assumptions are required.

Case II. The widths of the solution bands are not equal to zero and
specific signals are known: This is an indication that all acquired
spectra are mixture mass spectra and none of them should be used as
pure spectra for library searches and other subsequent processing.
The upper edges of the solution bands are the parent mass spectra and
should be used for qualitative analysis. For quantitative analysis the
points defining the upper edges of the solution bands, i.e. $(a_{u1}, b_{u1})$
and $(a_{u2}, b_{u2})$ in Figure 1, should be used to calculate molar compositions.

Case  III.  The  widths  of  the  solution  bands  are  not  equal  to  zero 
and  specific  signals  are  not  known:  This  is  the  most  challenging 
situation.  When  specific  signals  are  not  present  finite  solution 
band  widths  will  result  even  if  the  chromatographic  resolution  is  high. 

The  analyst  can  use  the  average  spectra  for  library  searches  after 
excluding  any  signal  with  an  unacceptably  large  range(4).  It  is 
recommended,  however,  that  the  lower  edges  of  the  solution  bands  (the 
purest  acquired  spectra)  be  first  examined  to  decide  whether  or  not 
they  are  indeed  the  parent  spectra  or  at  least  good  estimates.  As 
mentioned  earlier,  this  can  be  accomplished  by  examining  the 
chromatogram  or  the  band  widths  themselves.  The  evaluation  of  the 
lower  edges  of  the  solution  bands  can  be  very  advantageous.  If  they 
are  found  to  adequately  estimate  the  parent  spectra  then  they  can  be 
used  for  qualitative  analysis.  Quantitative  analysis  is  accomplished  by 
using  their  loadings  (i.e.  (aj^,bj^)  and  (a^,b^)  in  Figure  1)  to  calculate 
molar  compositions.  If  the  lower  edges  of  the  solution  bands  are 
not  found  to  be  adequate  estimates  of  the  parent  spectra,  the  analyst 
can  calculate  intervals  for  the  molar  compositions  using  both  the 
upper  and  the  lower  edges  of  the  solution  bands.  The  GC  peak  is, 
consequently,  resolved  twice  and  the  chromatographic  response  of 
each  component  is  represented  by  a  region  that  contains  the  true 
chromatogram and establishes an interval for its area.
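
Resolving the GC peak twice, once with each edge of the solution bands, can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the two four-mass parent spectra, the Gaussian elution profiles, the 10% contamination used to mimic the lower and upper band edges, and the helper names resolve_areas and area_intervals. The contributions are obtained by an ordinary least-squares (pseudo-inverse) fit of each mixture spectrum to the pair of edge spectra.

```python
import numpy as np

def resolve_areas(D, S):
    """Fit every mixture spectrum (row of D) as a combination of the two
    candidate pure spectra (rows of S) and sum the contributions over scans."""
    C = D @ np.linalg.pinv(S)      # least-squares solution of D ~ C @ S
    return C.sum(axis=0)           # one area per component

def area_intervals(D, S_lower, S_upper):
    """Resolve the peak twice, once per band edge, and return (low, high)."""
    a_low = resolve_areas(D, S_lower)
    a_up = resolve_areas(D, S_upper)
    return np.minimum(a_low, a_up), np.maximum(a_low, a_up)

# Hypothetical parent spectra (rows, each normalized to unit total intensity).
S_true = np.array([[0.50, 0.30, 0.15, 0.05],
                   [0.05, 0.15, 0.30, 0.50]])

# Hypothetical overlapping elution profiles with areas 100 and 60.
scans = np.arange(40)
g1 = np.exp(-0.5 * ((scans - 18) / 4.0) ** 2)
g2 = np.exp(-0.5 * ((scans - 22) / 4.0) ** 2)
true_areas = np.array([100.0, 60.0])
C_true = np.column_stack([g1 / g1.sum(), g2 / g2.sum()]) * true_areas
D = C_true @ S_true                # noise-free mixture spectra, scans x masses

# Band edges mimicked by mixing (lower) or over-correcting (upper) by 10%.
T_lower = np.array([[0.9, 0.1], [0.1, 0.9]])
T_upper = np.array([[1.1, -0.1], [-0.1, 1.1]])
low, high = area_intervals(D, T_lower @ S_true, T_upper @ S_true)

for name, lo, hi, t in zip(("component 1", "component 2"), low, high, true_areas):
    print(f"{name}: [{lo:.1f}, {hi:.1f}] (true area {t:.0f})")
```

In this noise-free example the two resolutions bracket the true area of each component, which is the behavior the interval approach relies on. With real data the width of each interval depends on how far the band edges lie from the parent spectra.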

In  the  above  treatment  of  experimental  data  the  specific 
signals  were  removed  from  all  of  the  mass  spectra.  As  a  result, 
finite band widths were obtained when CR was applied to all three
cases  shown  in  Figure  4.  The  solution  bands  are  used  to  calculate  an 
interval  for  the  area  of  each  individual  chromatogram.  Table  IX 
lists intervals of areas of the n-heptane and the methylcyclohexane signals
for  each  case  shown  in  Figure  4.  The  mean  of  each  interval  shown  in  Table 
IX  is  a  good  estimate  (within  less  than  3%)  of  the  corresponding  value 
listed  in  Table  VI.  The  upper  and  lower  bounds  of  the  areas  of 
the  GC  signals  define  the  concentration  ranges  of  the  individual 
components  in  the  sample.  Calculating  intervals  for  the  individual 
GC  peaks  is,  of  course,  a  "last  resort"  solution.  It  is  important  to 
realize  that  the  analyst  can  use  CR  to  estimate  intervals  for  the 
concentrations  of  the  components  even  when  they  are  completely 
inseparable (the method of linear least squares regression cannot
be used) and no information about their mass spectra is available
(LIM,  SIM  and  curve  fitting  of  MCs  are  inapplicable). 
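
As a worked example of how such intervals can be summarized, the sketch below computes the midpoint and half-width of each interval listed in Table IX. The dictionary layout and the function name summarize are illustrative choices, and the third row of the table is taken to be case (c) of Figure 4. The midpoint serves as a point estimate of the area, and the half-width expressed as a percentage of the midpoint indicates how tightly the solution bands constrain it.

```python
# Area intervals (arbitrary units) transcribed from Table IX.
TABLE_IX = {
    "a": {"n-heptane": (190442, 202620), "methylcyclohexane": (114174, 126500)},
    "b": {"n-heptane": (120811, 132005), "methylcyclohexane": (79568, 90762)},
    "c": {"n-heptane": (70524, 83947), "methylcyclohexane": (50776, 64199)},
}

def summarize(lower, upper):
    """Return (midpoint, half-width, half-width as percent of midpoint)."""
    mid = (lower + upper) / 2
    half = (upper - lower) / 2
    return mid, half, 100 * half / mid

summary = {case: {name: summarize(*bounds) for name, bounds in row.items()}
           for case, row in TABLE_IX.items()}

for case, row in summary.items():
    for name, (mid, half, pct) in row.items():
        print(f"case {case}, {name}: {mid:.1f} +/- {half:.1f} ({pct:.1f}%)")
```

For case (a), for instance, the n-heptane interval has a midpoint of 196531 and a half-width of 6089 area units.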
CONCLUSION

When the analyst uses the multivariate curve resolution method
described in this paper, the full amount of useful information that
the instrument can deliver is realized.

Accurate qualitative and quantitative analyses by GCMS are possible
in the absence of complete chromatographic resolution. With a
separation equivalent to only one mass spectral scan cycle (only
one pure mass spectrum is acquired for each component), CR offers
a powerful resolution tool that does not require limiting assumptions.

Presently, the method is applicable to binary mixtures. Work in
progress involves application of the methods to other
chromatographic-spectrometric systems (e.g. LC-UV) and a generalization
to more complex cases.

Table II. Percent error in the calculated areas of α-pinene
and pinane chromatograms at 4σ separation.

Table III. Percent error in the calculated areas of a Gaussian
profile (A2) preceded by a small gamma profile (A1).
Noise and 4σ separation.

Table IV. Percent error in the calculated areas of a gamma profile
(A1) followed by ... 4σ separation.

Table VI. Areas (arbitrary units) of resolved TIC peaks and specific
Mass Chromatograms of n-heptane (normal heptane) and
methylcyclohexane.

Table VII. Correlation between MC and CR and between MC and GI for
the areas of the n-heptane signals.

Table VIII. Correlation between MC and CR and between MC and GI for
the areas of the methylcyclohexane signals.

                                   MC and CR                   MC and GI

Regression coefficient, b          0.448 (a)                   0.448 (b)
(standard deviation)               (1.58 x 10^[illegible])     (2.35 x 10^[illegible])

Correlation coefficient, r         1.0000                      1.0000

0.99 confidence interval for ρ     [0.9994 - 1.0000]           [0.9983 - 1.0000]

(a) Model: A_MC = β A_CR
(b) Model: A_MC = β A_GI

Table IX. Intervals for the areas (arbitrary units) of the resolved
TIC peaks of n-heptane and methylcyclohexane.

Case     n-heptane              methylcyclohexane

a        [190442 - 202620]      [114174 - 126500]
b        [120811 - 132005]      [79568 - 90762]
c        [70524 - 83947]        [50776 - 64199]

Figure  Captions 


Fig. 1: Four pairs of loadings representing four mass spectra in the
factor loadings space. The points (a_l1, b_l1) and (a_l2, b_l2) define
the lower edges of the solution bands. Similarly the points
(a_u1, b_u1) and (a_u2, b_u2) designate the upper edges of the
solution bands.

Fig. 2: Simulated coelution of α-pinene and pinane at 0.0σ separation.
The solid lines are the theoretical profiles and their sum.
The two sets of plotted symbols are the calculated individual responses.

Fig. 3: Simulated coelution of α-pinene and pinane at 4.0σ separation.
The solid lines are the theoretical profiles and their sum.
The two sets of plotted symbols are the calculated individual responses.

Fig. 4: The coelution of n-heptane and methylcyclohexane at:
(a) 85°C and a flow rate of 33.2 ml/minute,
(b) 104°C and a flow rate of 33.3 ml/minute and
(c) 130°C and a flow rate of 34 ml/minute.